In the MySQL world, there are a couple of ways to take backups of your data. Backups in MySQL can be categorized into two distinct categories: logical or physical. MySQL also comes with a lot of tools helping us achieve our backup objectives: we have walked you through one of the most popular offerings in this space – Percona XtraBackup – in the second part of this series, and the first part of this series covered one of the main logical MySQL backup tools called mysqldump
. As far as backups are concerned, though, we are far from done – for example, mysqldump
alone has a brother called mysqlpump
. Since the two CLI-based backup tools are frequently confused with each other, the aim of this blog is to clear the confusion up a little.
What is mysqlpump?
mysqlpump
is a backup utility that is used via the command-line interface. The tool is very similar to mysqldump
in that it provides us with the ability to take logical backups, but also different at the same time – the goal of mysqlpump
is to be an extendable, parallel-supporting replacement of mysqldump
. In their blog from 2015, MySQL team said that one of the primary aims of introducing mysqlpump
was not be forced to implement legacy functionality that is provided by mysqldump
.
mysqlpump
was introduced to the world of MySQL in 2015 – the tool was added to MySQL 5.7.8.
Using mysqlpump
The most basic use case of mysqlpump
looks like this (the tool can be used in the same fashion both on Linux and on Windows):
mysqlpump
is designed in such a way that it will only dump tables that were created by a user – in other words, it will refrain from backing up tables that were created internally by MySQL (it won‘t touch any of the internal tables.)
The one feature that is exclusive to mysqlpump
, though, is the ability to issue backups using multiple threads at once – to figure out how many threads your system has, follow these steps:
- Linux users – issue a command like
lscpu
orlscpu | egrep ‘Thread|CPU\(s\)’
and look at the output (you should see something similar to “Thread(s) per core.”) - Windows users – open up task manager and look at the number next to “Logical processors” – they represent the amount of threads in the system:
Once you know the number of threads in your system, think about the number of threads you wish (and can) allocate to mysqlpump
– things might seem difficult and complex, but your thought process shouldn’t take more than a couple of seconds – mysqlpump
is usually quick to complete too.
Once you know the number of threads in your infrastructure, configure the number of threads that can be used together with mysqlpump
. Threads are to be configured using the --default-parallelism
and --parallel-schemas
options. Also, consider providing the username and password that you’re using to access MySQL itself through my.cnf for secure access: passwords provided through the CLI can be observed by accessing the history of issued commands. If you don’t want to do that, feel free to provide a username and a password, but be aware that mysqlpump
itself will issue you warnings if you do so:
mysqlpump --default-parallelism=[threads] > data.sql
Since the “specialty” of mysqlpump
is to back up data using multiple threads, we can also define the number of threads that we want the tool to use to dump a specific database – this functionality of mysqlpump
can be exceptionally useful if we’re working with bigger sets of data, but need a logical backup to be taken.
mysqlpump – the Details
- To specify more threads to dump a large database and less threads for smaller databases, look at the option named
parallel-schemas
. If you define your parameters like so, 6 threads will be used for larger databases, and 2 for smaller ones:
mysqlpump --parallel-schemas=6:large_db1,large_db2 --parallel_schemas=2:small_db [--default-parallelism=x] > backup.sql
Note the--default-parallelism
option: a default number of threads when dumping other databases (databases that are not defined) can also be specified. - To back up only specific databases, define your query like so (here db_1 and db_2 represent two separate databases):
mysqlpump –-databases db_1 db_2 > databases.sql
You can also accomplish the same goal like so:
mysqlpump --include-databases=db_1,db_2 --result-file=dump.sql
mysqlpump
can also perform an “empty” dump (meaning that it can only back up the schema of the database, but not the data contained within):
mysqlpump --include-databases=data_1,db2,db3 --skip-dump-rows --result-file=mydata.sql
mysqlpump
also gives us the ability to use wildcards inside of any parameter. That means that if we have a lot of databases, backing a part of them (only those starting with the letter x, for example) is a breeze:
mysqlpump --include-databases=x% --result-file=s_databases.sql
- We can also exclude databases from being backed up like so (the following query would exclude all databases starting with
test_
and back up all the rest):
mysqlpump --exclude-databases=test_% --result-file=data.sql
- We can also work with entire patterns (the following query would exclude all tables matching the
demo
pattern from the beginning):
mysqlpump --exclude-tables=__demo --result-file=backup.sql
- And last but not least,
mysqlpump
is also able to work with events and routines – all the same, just specify events or routines instead of databases and tables:
mysqlpump --[include|exclude]-[events|routines]=title1,title2 --result-file=verycoolbackup.sql
mysqlpump
can do a number of other things as well – all of the information on mysqlpump
, as with everything MySQL-related, can be found at the documentation.
The Downsides of mysqlpump
However, as powerful as mysqlpump
is, it’s also not without its weaknesses. Research made by Giuseppe Maxia back in September 2015 suggests that mysqlpump
is faster than mysqldump
, but only slightly – the blogger provides an example where mysqldump
takes 3 minutes and 33 seconds to execute, whereas mysqlpump takes 2 minutes and 55 seconds – the difference is there, but we presume that the blogger has expected it to be way bigger than it was.
The research was made with approximately 20 million rows inside of the tables in the database – a figure big enough for both scripts to handle, but not large enough to invoke SELECT * INTO OUTFILE
(a command used to back up bigger sets of data.) This suggests that mysqlpump
is good if we find ourselves needing to use specific options not available to mysqldump
, but doesn’t provide much of an upside otherwise.
Summary
mysqlpump
entered the MySQL scene in 2015 with the release of MySQL 5.7.8. At first it was thought that it could be a reliable replacement for mysqldump
and also offer a couple of “exotic” options not available to its counterpart, but as time went by mysqlpump
proved that it still needs more refinement to be considered a reliable replacement for mysqldump.
Both mysqldump
and mysqlpump
are part of the MySQL’s ecosystem to this day, yet people steer towards mysqldump
more than they do towards mysqlpump
– while mysqlpump
has its use cases, it is thought that its speed is not significant enough to outweigh its counterpart.
We hope that you’ve enjoyed reading this article, stay around the Redgate blog to learn more about everything database-related, and until next time.
Load comments